---
title: "Power Analysis"
author: "Michael F. Seese"
date: "Last Updated: `r format(Sys.time(), '%d %B %Y')`"
output: 
  html_document:
      toc: true
      toc_depth: 4
      toc_float:
        collapsed: false
      code_folding: show
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

rm(list = ls())

library("tidyverse")
library("DeclareDesign")
library("modelsummary")
library("kableExtra")
```

# Overview

This is a post hoc power analysis for the manuscript "Symptoms and Stereotypes: Perceptions and Responses to COVID–19 in Malawi and Zambia,'' by Karen Ferree, et al.

This document is not for public circulation (i.e., don't email this to your reviewers).  Basically, this is my stream of consciousness thought process as I work through this problem.


# Research Design
First, I read through the main manuscript to parse out the research design.

This is a *single profile vignette* design that presents respondents in Malawi and Zambia with a hypothetical neighbor.  The neighbor's attributes and their levels are given in the table below:


|    | Age $(A_1)$   | Gender $(A_2)$    | Time $(A_3)$    | Identity $(A_4)$    | Symptoms $A_5)$    |
| :--   | :--   | :--   | :--   | :--   | :--   |
| Level 1   | 25    | Man   | Many Years    | ID 1 - Insider    | Leg   |
| Level 2   | 60    | Woman   | Few Months    | ID 2 - Outsider   | Cough + Fever   |
| Level 3   | -    |  -   | -    | ID 3 - Outsider    | High Fever    |


After being presented with the neighbor's attributes, respondents completed three choice tasks, which elicited the their:

1. Willingness to accompany the neighbor to the hospital,
2. Thoughts on whether the neighbor should be allowed to move about freely, and
3. Thoughts on whether or not the neighbor had COVID-19.

The experiment was conducted via phone, with the enumerator reading off the vignette to the respondent.

Two waves of this survey were conducted in Malawi, and a third wave was conducted in Zambia.  But for simplicity, I'm going to ignore the fact that there were three waves, and that the two Malawi waves were a partial panel. Instead, we'll conduct the analysis as if these were three independent experiments.

# Strategy

Let's say I didn't read the manuscript that carefully, and I had no idea that we were working with $\approx$ 4500 respondents in the two Malawi waves and about 2200 in Zambia.

Let's also stipulate that I read your pre-analysis plan and knew that you were going to estimate the average marginal component effects (AMCE) using an OLS specification similar to:

$$ y = \alpha + \beta_1(A_1) + \beta_2(A_2) + \beta_3(A_3) + \beta_3(A_3) + \beta_4(A_4) + \beta_5(A_5) + \epsilon  $$
Where $y$ is a binary response indicator (for simplicity, let's say "don't know," "unsure what COVID is," and "refuse to answer" are considered missing data and listwise deleted).  We thus have a relatively straightforward linear probability model.  $A_i$ is a dummy indicator for attribute $i$. Each attribute is randomized independently, and respondents have an equal probability of receiving each level of each attribute, with the exception of the **identity** variable, which, in the analysis, seems to lump two of the levels into a single "outsider" category.  We can thus reclassify the levels of identity to "insider" and "outsider," with $p(A_1 = 0) = \frac{1}{3}$ and $p(A_1 = 1) = \frac{2}{3}$.  Note that the model excludes pre-treatment covariates; it also excludes the interaction term $(A_4 \times A_5)$, which I'll address later.

What makes this particular case tricky is that this is a single profile vignette.  Most conjoint experiments follow the forced-choice model, so there's a lot more literature out there to base a power analysis on.

My strategy, then, is twofold:

1. Use Macartan et al.'s `DelcareDesign` package to specify and diagnose the design, and 
2. Use an ad hoc simulation-based approach to estimate power (which I'm still working on).

*I should note that I am currently running into issues with `DeclareDesign`'s `redesign` function. I have calls out to Jasper and Berlin to try to figure things out; hopefully they can solve these issues.  In the interim I'll be doing this longhand.

# Declare Design

`DelcareDesign` requires us to declare different features of the the design in a step by step process.

We start with a couple of preliminary steps: setting a seed for replication purposes, and and then defining a helper function that will determine the outcome of the binary choice task (i.e. "Will you take your neighbor to the hospital'' $\in$ {No = 0, Yes = 1})

```{r, preliminaries}
set.seed(37851) # Thursday, May 19, 2022, 11:33 AM CET


# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}
```

Next, I'm going to define the attributes, their levels, and the associated probabilities:

```{r, attributes}
# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)
```

Here are the simulation parameters.  We're going to start off with a sample size of $n = 2500$, we're going to stipulate an effect size for *all* treatments at 0.025 (about 2.5 percentage points), and we're going to run the simulation 1000 times.

```{r, sim.par}
# Design Parameters ############################################################
## Observations
n_obs <-  2500

##Simulations
n_sims <- 1000

## AMCE
amce <- 0.025
```

Now we can create our hypothetical sample of $n = 2500$.  The sample will only have 2 attributes: some noise (the error term in the estimator) and a unique ID.

```{r, dec.pop}
# Create Population ############################################################
df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")
```

The next step is to define the assignment function.  This function takes the dataset we just created above and assigns each of those respondents to a treatment condition, $A_i$, with the probabilities we defined above.  The "vignette" variable I define is just the combination of attribute levels assigned to a given respondent.

```{r, dec.assign}
# Assignment Function ##########################################################
fun_assign_resp <- function(data) {
  data %>%
    mutate(a1 = complete_ra(N = n_obs,
                            conditions = a1,
                            prob_each = a1_prob),
           a2 = complete_ra(N = n_obs,
                            conditions = a2,
                            prob_each = a2_prob),
           a3 = complete_ra(N = n_obs,
                            conditions = a3,
                            prob_each = a3_prob),
           a4 = complete_ra(N = n_obs,
                            conditions = a4,
                            prob_each = a4_prob),
           a5 = complete_ra(N = n_obs,
                            conditions = a5,
                            prob_each = a5_prob),
           vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
}

assign_respondents <- declare_assignment(handler = fun_assign_resp)
```

We now need to calculate each respondent's utility, which is just the overall probability of selecting "yes" in a choice task after exposure to the vignette.  This follows the OLS specification above.  I'm also adding in the noise we defined above, and 0 as well.  Later, we can substitute in some arbitrary value for 0 to represent some baseline probability of answering "yes" to the choice task.

Note that the calculated utility for each respondent is just a function of our inputs, which does not conform to our binary outcome.  This is why we defined the helper function above---it takes our calculated utility and converts it to {0,1}.

```{r, dec.utility}
# Utility ######################################################################
fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 + amce) * (a1 == "60") +
             qnorm(0.5 + amce) * (a2 == "Woman") +
             qnorm(0.5 + amce) * (a3 == "Many Years") +
             qnorm(0.5 + amce) * (a4 == "Outsider") +
             qnorm(0.5 + amce) * (a5 == "Cough Fever") +
             qnorm(0.5 + amce) * (a5 == "High Fever") +
             noise +
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)
```

We are now ready to declare our design.  We're telling the software to use the dataset we fabricated above as our population, how to assign respondents to treatment, and how the utility is calculated.  That last step, `declare_estimator`, tells the software how we plan to estimate the the outcome.  In this case, we're using a standard linear model with robust (just like Stata!) standard errors.  We are interested in 6 terms: the coefficients for each of our dummy treatment variables, where 1 = {"60", "Woman", "Many Year", "Outsider", "Cough Fever", "High Fever"}.

```{r, dec.design}
# Declare Design ###############################################################
design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a5Cough Fever", "a5High Fever"))
```

So once we have all that put together, we can take a peek at the dataset `DeclareDesign` assembled for us:

```{r, peek}
# Look at a Single Simulation of Input Data ####################################
q <- design %>%
  draw_data()

q %>%
  head(10) %>% 
  kable() %>% 
  kable_styling()
```

## Simulation 1

Now we can diagnose our initial design, using the `diagnose_design` call. Recall that $n = 2500$, and that we stipulated an AMCE of 0.025 for all attributes.  The plot below shows the density for the AMCE estimates of each term; the red line is the "true" AMCE.

```{r, d.1.a, fig.width = 10}
# Diagnosis & Simulation #######################################################
diagnosis_1 <- diagnose_design(design, sims = n_sims)

get_simulations(diagnosis_1) %>%
  ggplot(aes(estimate)) +
  geom_density() +
  facet_grid(~ term) +
  geom_vline(xintercept = amce, col = "red", linetype = "dashed")
```

Of course, we are interested in power, which was calculated for us by `DeclareDesign`. The columns in the table below provide the term, the mean AMCE estimate and its standard error, and the estimated power and its standard error.

```{r, d.1.b}
diagnosis_1$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()
```

The mean AMCE esitmates are all pretty close to the "true" AMCE.  Power is obviously not great given these particular parameters---it's a small effect size and a relatively small sample size.

Another way to visualize power is to look at a histogram of significant and not-significant AMCE estimates for each term.  Below, power is represented as the ratio of the blue bars to the red bars.  I also note the power in title of each cell (sorry it's small and hard to read).

```{r, d.1.c, fig.width = 10}
x <- diagnosis_1$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_1$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl) +
  theme(strip.text.x = element_text(size = 5))
```

Note that power for the two symptom condition is quite a bit lower than power for the binary conditions assigned with equal probability.  This provides a nice reality check; power should be lower in these two conditions given the smaller $n$ assigned each level. This suggests the simulation is returning reasonable estimates.

## Simulation 2

Let's now run the same simulation, but change the AMCE to 0.05.  We'll keep the sample size the same.


```{r, d.2}
amce <- 0.05

fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 + amce) * (a1 == "60") +
             qnorm(0.5 + amce) * (a2 == "Woman") +
             qnorm(0.5 + amce) * (a3 == "Many Years") +
             qnorm(0.5 + amce) * (a4 == "Outsider") +
             qnorm(0.5 + amce) * (a5 == "Cough Fever") +
             qnorm(0.5 + amce) * (a5 == "High Fever") +
             noise + 
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)

design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a5Cough Fever", "a5High Fever"))

diagnosis_2 <- diagnose_design(design, sims = n_sims)

diagnosis_2$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()
```

These results are a bit more biased, in that the average AMCE estimate minus the "true" AMCE is quite large.

Let's take a look at the distributions of the estimated AMCEs:

```{r, d.2.a, fig.width = 10}
x <- diagnosis_2$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_2$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl) +
  theme(strip.text.x = element_text(size = 5)) + 
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  geom_vline(xintercept = amce, col = "red", linetype = "dashed")
```

## Simulation 3

What Happens if we increase our sample size to $n = 4000$, but keep this same AMCE of 0.05?

```{r, d.3}
n_obs <-  4000

df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")

design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a5Cough Fever", "a5High Fever"))

diagnosis_3 <- diagnose_design(design, sims = n_sims)

diagnosis_3$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()
```

Power increases substantially.  Looks like you have reasonable power to detect a 5 point effect on all of the binary attributes assigned with equal probability, though less so for "outsider" and the two symptoms treatments.

## Simulation 4

Let's do an actual actual power analysis for one of the Malawi waves.  In Wave 2, you had a sample size of $n \approx 4300$.  We'll specify the following AMCEs, which seem to be consistent with the "Has Covid" outcome in Table A1 in the Appendix:

* 60 Years Old: -0.01
* Female: -0.02
* Many Years: 0.03
* Outsider: 0.05
* Cough and Fever: 0.35
* High Fever: 0.25

Note the sign switch for "many years," as you use a "many years" as the base level in the analysis, whil I'm using "few months."

```{r malawi.sim1, fig.width = 10, cache = TRUE}
n_obs <-  4300


df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")

fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
             qnorm(0.5 - 0.02) * (a2 == "Woman") +
             qnorm(0.5 - 0.03) * (a3 == "Many Years") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider") +
             qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
             qnorm(0.5 + 0.25) * (a5 == "High Fever") +
             noise + 
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)


design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a5Cough Fever", "a5High Fever"))

diagnosis_4 <- diagnose_design(design, sims = n_sims)

diagnosis_4$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()

get_simulations(diagnosis_4) %>%
  ggplot(aes(estimate)) +
  geom_density() +
  facet_grid(~ term, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a160"), 
             aes(xintercept = -0.01), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a2Woman"), 
             aes(xintercept = -0.02), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a3Many Years"), 
             aes(xintercept = -0.03), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a4Outsider"), 
             aes(xintercept = 0.05), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a5Cough Fever"), 
             aes(xintercept = 0.35), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a5High Fever"), 
             aes(xintercept = 0.25), 
             col = "red", linetype = "dashed")

x <- diagnosis_4$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_4$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl, scales = "free") +
  theme(strip.text.x = element_text(size = 5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

This simulation jives with my priors; the relatively large effects on the two symptom conditions should be easier to detect than the relatively small effects on age and gender.

## Simulation 5

What about the interaction?  Let's re-run the previous simulation, but change the estimator to include an interaction term.

```{r malawi.sim2, fig.width = 10, cache = TRUE}
n_obs <-  4300


df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")

fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
             qnorm(0.5 - 0.02) * (a2 == "Woman") +
             qnorm(0.5 - 0.03) * (a3 == "Many Years") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider") +
             qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
             qnorm(0.5 + 0.25) * (a5 == "High Fever") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider" & a5 == "High Fever") +
             qnorm(0.5 + 0.01) * (a4 == "Outsider" & a5 == "Cough Fever") +
             noise + 
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)


design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 * a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a4Outsider:a5Cough Fever", "a4Outsider:a5High Fever",
                             "a5Cough Fever", "a5High Fever"))

diagnosis_5 <- diagnose_design(design, sims = n_sims)

diagnosis_5$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()

get_simulations(diagnosis_5) %>%
  ggplot(aes(estimate)) +
  geom_density() +
  facet_grid(~ term, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

x <- diagnosis_5$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_5$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl, scales = "free") +
  theme(strip.text.x = element_text(size = 5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

## Simulation 6

Now let's look at a pooled sample.  What happens if we stipulate the same effect sizes as  Simulation 4, but increased the sample size to $n_\mathrm{Malawi1} + n_\mathrm{Malawi2} + n_\mathrm{Zambia} = 4600 + 4300 + 2200 = 11100$.

```{r pooled.sim1, fig.width = 10, cache = TRUE}
n_obs <-  11100


df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")

fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
             qnorm(0.5 - 0.02) * (a2 == "Woman") +
             qnorm(0.5 - 0.03) * (a3 == "Many Years") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider") +
             qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
             qnorm(0.5 + 0.25) * (a5 == "High Fever") +
             noise + 
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)


design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a5Cough Fever", "a5High Fever"))

diagnosis_6 <- diagnose_design(design, sims = n_sims)

diagnosis_6$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()

get_simulations(diagnosis_6) %>%
  ggplot(aes(estimate)) +
  geom_density() +
  facet_grid(~ term, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a160"), 
             aes(xintercept = -0.01), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a2Woman"), 
             aes(xintercept = -0.02), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a3Many Years"), 
             aes(xintercept = -0.03), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a4Outsider"), 
             aes(xintercept = 0.05), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a5Cough Fever"), 
             aes(xintercept = 0.35), 
             col = "red", linetype = "dashed") +
  geom_vline(data = filter(get_simulations(diagnosis_4), term == "a5High Fever"), 
             aes(xintercept = 0.25), 
             col = "red", linetype = "dashed")

x <- diagnosis_6$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_6$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl, scales = "free") +
  theme(strip.text.x = element_text(size = 5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

Here, age and gender are such small effects that you're still short on power, though there does seem to be sufficient power to uncover a 3 point effect on the outsider attribute.


## Simulations 7

Let's also simulate power for the interaction effects, using the pooled $n$.

```{r pooled.sim2, fig.width = 10, cache = TRUE}
n_obs <-  11100


df_conj <- fabricate(N = n_obs,
                     noise = rnorm(N, sd = 1),
                     ID_label = "id")

fun_assign_util <- function(data) {
  data %>%
    mutate(utility = 
             qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
             qnorm(0.5 - 0.02) * (a2 == "Woman") +
             qnorm(0.5 - 0.03) * (a3 == "Many Years") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider") +
             qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
             qnorm(0.5 + 0.25) * (a5 == "High Fever") +
             qnorm(0.5 + 0.05) * (a4 == "Outsider" & a5 == "High Fever") +
             qnorm(0.5 + 0.01) * (a4 == "Outsider" & a5 == "Cough Fever") +
             noise + 
             0) %>%
    mutate(outcome = determine_outcome(utility))
}

assign_utility <- declare_step(handler = fun_assign_util)


design <- declare_population(df_conj) +
  assign_respondents +
  assign_utility +
  declare_estimator(outcome ~ a1 + a2 + a3 + a4 * a5,
                    model = lm_robust,
                    se_type = "stata",
                    term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                             "a4Outsider:a5Cough Fever", "a4Outsider:a5High Fever",
                             "a5Cough Fever", "a5High Fever"))

diagnosis_7 <- diagnose_design(design, sims = n_sims)

diagnosis_7$diagnosands_df %>%
  select(term, mean_estimate, `se(mean_estimate)`, power, `se(power)`) %>%
  kable(digits = 3) %>%
  kable_styling()

get_simulations(diagnosis_7) %>%
  ggplot(aes(estimate)) +
  geom_density() +
  facet_grid(~ term, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

x <- diagnosis_7$simulations_df %>%
  mutate(sig = ifelse(p.value <= 0.05, "Yes", "No"))

x <- left_join(x, diagnosis_7$diagnosands_df) %>%
  mutate(lbl = paste0(term, " (pwr = ", power, ", n = ", n_obs, ")"))

x %>% 
  ggplot(aes(estimate, fill = sig)) +
  geom_histogram(position = "identity", alpha = 0.5) +
  facet_grid(~ lbl, scales = "free") +
  theme(strip.text.x = element_text(size = 5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

You're still under-powered for the interaction terms, gender, and age.

## Power Curves 1

Let's plot some traditional power curves.  As I mentioned above, I'm having issues with the package's (apparently) experimental `resdeign` function, so I'm going to do this in the most inefficient way possible: I'm going to throw everything into a for loop.  Sorry.

For this set of curves, let's stipulate an AMCE of 0.05 for all attributes, just to set the baseline. We'll calculate power for sample sizes ranging from 2000 to 4500 in intervals of 500.

```{r, power.1, fig.width = 10, cache = TRUE}
rm(list = ls())

# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 500

## AMCE
amce <- 0.05

## Empty DF
results <- NULL


# Sample Sizes #################################################################
n_obs <- list(2000, 2500, 3000, 3500, 4000, 4500)


# The Loop #####################################################################

for (i in n_obs) {
  
  df_conj <- fabricate(N = i,
                       noise = rnorm(N, sd = 1),
                       ID_label = "id")
  
  fun_assign_resp <- function(data) {
    data %>%
      mutate(a1 = complete_ra(N = i,
                              conditions = a1,
                              prob_each = a1_prob),
             a2 = complete_ra(N = i,
                              conditions = a2,
                              prob_each = a2_prob),
             a3 = complete_ra(N = i,
                              conditions = a3,
                              prob_each = a3_prob),
             a4 = complete_ra(N = i,
                              conditions = a4,
                              prob_each = a4_prob),
             a5 = complete_ra(N = i,
                              conditions = a5,
                              prob_each = a5_prob),
             vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
  }
  
  assign_respondents <- declare_assignment(handler = fun_assign_resp)
  
  fun_assign_util <- function(data) {
    data %>%
      mutate(utility = 
               qnorm(0.5 + amce) * (a1 == "60") +
               qnorm(0.5 + amce) * (a2 == "Woman") +
               qnorm(0.5 + amce) * (a3 == "Many Years") +
               qnorm(0.5 + amce) * (a4 == "Outsider") +
               qnorm(0.5 + amce) * (a5 == "Cough Fever") +
               qnorm(0.5 + amce) * (a5 == "High Fever") +
               noise + 0.25) %>%
      mutate(outcome = determine_outcome(utility))
  }
  
  assign_utility <- declare_step(handler = fun_assign_util)
  
  design <- declare_population(df_conj) +
    assign_respondents +
    assign_utility +
    declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                      model = lm_robust,
                      se_type = "stata",
                      term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                               "a5Cough Fever", "a5High Fever"))
  
  diagnosis <- diagnose_design(design, sims = n_sims)
  
  x <- diagnosis$diagnosands_df
  x$observations <- i
  
  results <- rbind(results, x)
}


results %>%
  ggplot(aes(observations, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  facet_grid(~ term) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Power Curves by Attribute for an AMCE of 0.05")
```


## Power Curves 2

Now let's estimate some power curves for your actual estimated effect sizes, as in Simulation 4:

* 60 Years Old: -0.01
* Female: -0.02
* Many Years: 0.03
* Outsider: 0.05
* Cough and Fever: 0.35
* High Fever: 0.25

```{r, power.2, fig.width = 10, cache = TRUE}
rm(list = ls())


# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 500

## Empty DF
results <- NULL


# Sample Sizes #################################################################
n_obs <- list(2000, 2500, 3000, 3500, 4000, 4500)


# The Loop #####################################################################

for (i in n_obs) {
  
  df_conj <- fabricate(N = i,
                       noise = rnorm(N, sd = 1),
                       ID_label = "id")
  
  fun_assign_resp <- function(data) {
    data %>%
      mutate(a1 = complete_ra(N = i,
                              conditions = a1,
                              prob_each = a1_prob),
             a2 = complete_ra(N = i,
                              conditions = a2,
                              prob_each = a2_prob),
             a3 = complete_ra(N = i,
                              conditions = a3,
                              prob_each = a3_prob),
             a4 = complete_ra(N = i,
                              conditions = a4,
                              prob_each = a4_prob),
             a5 = complete_ra(N = i,
                              conditions = a5,
                              prob_each = a5_prob),
             vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
  }
  
  assign_respondents <- declare_assignment(handler = fun_assign_resp)
  
  fun_assign_util <- function(data) {
    data %>%
      mutate(utility = 
               qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
               qnorm(0.5 - 0.02) * (a2 == "Woman") +
               qnorm(0.5 - 0.03) * (a3 == "Many Years") +
               qnorm(0.5 + 0.05) * (a4 == "Outsider") +
               qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
               qnorm(0.5 + 0.25) * (a5 == "High Fever") +
               noise + 0.25) %>%
      mutate(outcome = determine_outcome(utility))
  }
  
  assign_utility <- declare_step(handler = fun_assign_util)
  
  design <- declare_population(df_conj) +
    assign_respondents +
    assign_utility +
    declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                      model = lm_robust,
                      se_type = "stata",
                      term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                               "a5Cough Fever", "a5High Fever"))
  
  diagnosis <- diagnose_design(design, sims = n_sims)
  
  x <- diagnosis$diagnosands_df
  x$observations <- i
  
  results <- rbind(results, x)
}


results %>%
  ggplot(aes(observations, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  facet_grid(~ term) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Power Curves by Attribute")
```

## Power Curves 3

OK, let's say that you wanted to figure out the smallest detectable effect for your Zambia sample ($n \approx 2000$).  In this simulation, we'll let the AMCE for age vary between between -0.02 and 0.12, and we'll fix the other AMCEs at:

* Female: -0.02
* Many Years: 0.03
* Outsider: 0.05
* Cough and Fever: 0.35
* High Fever: 0.25

This should give a ballpark for the smallest detectable effect for one of your binary treatments assigned with equal probability.


```{r, power.3, fig.width = 10, cache = TRUE}
rm(list = ls())

# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 750

## Empty DF
results <- NULL

## Observations
n_obs = 2000


# Effect Sizes #################################################################
amce <- seq(-0.02, 0.12, 0.01)


# The Loop #####################################################################

for (i in amce) {
  
  df_conj <- fabricate(N = n_obs,
                       noise = rnorm(N, sd = 1),
                       ID_label = "id")
  
  fun_assign_resp <- function(data) {
    data %>%
      mutate(a1 = complete_ra(N = n_obs,
                              conditions = a1,
                              prob_each = a1_prob),
             a2 = complete_ra(N = n_obs,
                              conditions = a2,
                              prob_each = a2_prob),
             a3 = complete_ra(N = n_obs,
                              conditions = a3,
                              prob_each = a3_prob),
             a4 = complete_ra(N = n_obs,
                              conditions = a4,
                              prob_each = a4_prob),
             a5 = complete_ra(N = n_obs,
                              conditions = a5,
                              prob_each = a5_prob),
             vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
  }
  
  assign_respondents <- declare_assignment(handler = fun_assign_resp)
  
  fun_assign_util <- function(data) {
    data %>%
      mutate(utility = 
               qnorm(0.5 + i) * (a1 == "60") +
               qnorm(0.5 - 0.02) * (a2 == "Woman") +
               qnorm(0.5 - 0.03) * (a3 == "Many Years") +
               qnorm(0.5 + 0.05) * (a4 == "Outsider") +
               qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
               qnorm(0.5 + 0.25) * (a5 == "High Fever") +
               noise + 
               0) %>%
      mutate(outcome = determine_outcome(utility))
  }
  
  assign_utility <- declare_step(handler = fun_assign_util)
  
  design <- declare_population(df_conj) +
    assign_respondents +
    assign_utility +
    declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                      model = lm_robust,
                      se_type = "stata",
                      term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                               "a5Cough Fever", "a5High Fever"))
  
  diagnosis <- diagnose_design(design, sims = n_sims)
  
  x <- diagnosis$diagnosands_df
  x$amce <- i
  
  results <- rbind(results, x)
}


results %>%
  subset(term == "a160") %>%
  ggplot(aes(amce, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  labs(title = "Power Curve by AMCE",
       subtitle = "Sample Size n = 2000") +
  scale_x_continuous(breaks = seq(-.02, 0.12, 0.01))
```

It looks like you don't have sufficient power to detect an AMCE < 7 points or so.  This jives with the findings above.

## Power Curves 4

Let's repeat this, but also vary the sample size to cover Zambia, the Malawi Waves, and the pooled sample $n = \{2000, 4200, 11100\}$.  We'll stipulate the same AMCEs, but let one of the binary attributes (age) vary.

```{r, power.4, fig.width = 10, cache = TRUE}
rm(list = ls())

# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 500

## Empty DF
results <- NULL



# Sample & Effect Sizes ########################################################
amce <- seq(-0.02, 0.12, 0.01)

n_obs <- list(2000, 4200, 11000)


# The Loop #####################################################################

for (i in amce) {
  for (j in n_obs) {
    
    df_conj <- fabricate(N = j,
                         noise = rnorm(N, sd = 1),
                         ID_label = "id")
    
    fun_assign_resp <- function(data) {
      data %>%
        mutate(a1 = complete_ra(N = j,
                                conditions = a1,
                                prob_each = a1_prob),
               a2 = complete_ra(N = j,
                                conditions = a2,
                                prob_each = a2_prob),
               a3 = complete_ra(N = j,
                                conditions = a3,
                                prob_each = a3_prob),
               a4 = complete_ra(N = j,
                                conditions = a4,
                                prob_each = a4_prob),
               a5 = complete_ra(N = j,
                                conditions = a5,
                                prob_each = a5_prob),
               vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
    }
    
    assign_respondents <- declare_assignment(handler = fun_assign_resp)
    
    fun_assign_util <- function(data) {
      data %>%
        mutate(utility = 
                 qnorm(0.5 + i) * (a1 == "60") +
                 qnorm(0.5 - 0.02) * (a2 == "Woman") +
                 qnorm(0.5 - 0.03) * (a3 == "Many Years") +
                 qnorm(0.5 + 0.05) * (a4 == "Outsider") +
                 qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
                 qnorm(0.5 + 0.25) * (a5 == "High Fever") +
                 noise + 
                 0) %>%
        mutate(outcome = determine_outcome(utility))
    }
    
    assign_utility <- declare_step(handler = fun_assign_util)
    
    design <- declare_population(df_conj) +
      assign_respondents +
      assign_utility +
      declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                        model = lm_robust,
                        se_type = "stata",
                        term = c("a160", "a2Woman", "a3Many Years", "a4Outsider", 
                                 "a5Cough Fever", "a5High Fever"))
    
    diagnosis <- diagnose_design(design, sims = n_sims)
    
    x <- diagnosis$diagnosands_df
    x$amce <- i
    x$observations <- j
    
    results <- rbind(results, x)
  }
}

results %>%
  subset(term == "a160") %>%
  ggplot(aes(amce, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  geom_line() +
  geom_point() +
  facet_grid(~ as.factor(observations)) +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  labs(title = "Power Curve by AMCE and Sample Size",
       subtitle = "Sample Size n = 2000") +
  scale_x_continuous(breaks = seq(-.02, 0.12, 0.01)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```


## Power Curves 5

Let's now re-run the same simulation, but vary the two interaction effects over the same range.

```{r, power.5, fig.width = 10, cache = TRUE}
rm(list = ls())

library(DeclareDesign)
library(tidyverse)

# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider")
a4_prob <- c(1 / 3, 2 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 500

## Empty DF
results <- NULL



# Sample & Effect Sizes ########################################################
amce <- seq(-0.02, 0.12, 0.01)

n_obs <- list(2000, 4200, 11000)


# The Loop #####################################################################

for (i in amce) {
  for (j in n_obs) {
    
    df_conj <- fabricate(N = j,
                         noise = rnorm(N, sd = 1),
                         ID_label = "id")
    
    fun_assign_resp <- function(data) {
      data %>%
        mutate(a1 = complete_ra(N = j,
                                conditions = a1,
                                prob_each = a1_prob),
               a2 = complete_ra(N = j,
                                conditions = a2,
                                prob_each = a2_prob),
               a3 = complete_ra(N = j,
                                conditions = a3,
                                prob_each = a3_prob),
               a4 = complete_ra(N = j,
                                conditions = a4,
                                prob_each = a4_prob),
               a5 = complete_ra(N = j,
                                conditions = a5,
                                prob_each = a5_prob),
               vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
    }
    
    assign_respondents <- declare_assignment(handler = fun_assign_resp)
    
    fun_assign_util <- function(data) {
      data %>%
        mutate(utility = 
                 qnorm(0.5 - 0.01) * (a1 == "60") +               # Manually add AMCEs
                 qnorm(0.5 - 0.02) * (a2 == "Woman") +
                 qnorm(0.5 - 0.03) * (a3 == "Many Years") +
                 qnorm(0.5 + 0.05) * (a4 == "Outsider") +
                 qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
                 qnorm(0.5 + 0.25) * (a5 == "High Fever") +
                 qnorm(0.5 + i) * (a4 == "Outsider" & a5 == "High Fever") +
                 qnorm(0.5 + i) * (a4 == "Outsider" & a5 == "Cough Fever") +
                 noise + 
                 0) %>%
        mutate(outcome = determine_outcome(utility))
    }
    
    assign_utility <- declare_step(handler = fun_assign_util)
    
    design <- declare_population(df_conj) +
      assign_respondents +
      assign_utility +
      declare_estimator(outcome ~ a1 + a2 + a3 + a4 * a5,
                        model = lm_robust,
                        se_type = "stata",
                        term = TRUE)
    
    diagnosis <- diagnose_design(design, sims = n_sims)
    
    x <- diagnosis$diagnosands_df
    x$amce <- i
    x$observations <- j
    
    results <- rbind(results, x)
  }
}

results %>%
  subset(term == "a4Outsider:a5High Fever") %>%
  ggplot(aes(amce, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  geom_line() +
  geom_point() +
  facet_grid(~ as.factor(observations)) +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  labs(title = "Power Curve by (Interaction) AMCE and Sample Size") +
  scale_x_continuous(breaks = seq(-.02, 0.12, 0.01)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```


## Power Curves 6

Let's also take a look at a three level identity variable assigned with equal probability.

```{r, power.6, fig.width = 10, cache = TRUE}
rm(list = ls())

# Helper Function (Determines Binary Outcome) ##################################
determine_outcome <- function(utility) {
  round(pnorm(utility), 0)
}

# Conjoint Attributes & Associated Probabilities ###############################
a1 <- c("25", "60")
a1_prob <- c(0.5, 0.5)

a2 <- c("Man", "Woman")
a2_prob <- c(0.5, 0.5)

a3 <- c("Few Months", "Many Years")
a3_prob <- c(0.5, 0.5)

a4 <- c("Insider", "Outsider1", "Outsider2")
a4_prob <- c(1 / 3, 1 / 3, 1 / 3)

a5 <- c("Leg", "Cough Fever", "High Fever")
a5_prob <- c(1 / 3, 1 / 3, 1 / 3)


# Design Parameters ############################################################
##Simulations
n_sims <- 500

## Empty DF
results <- NULL



# Sample & Effect Sizes ########################################################
amce <- seq(-0.02, 0.12, 0.01)

n_obs <- list(2000, 4200, 11000)


# The Loop #####################################################################

for (i in amce) {
  for (j in n_obs) {
    
    df_conj <- fabricate(N = j,
                         noise = rnorm(N, sd = 1),
                         ID_label = "id")
    
    fun_assign_resp <- function(data) {
      data %>%
        mutate(a1 = complete_ra(N = j,
                                conditions = a1,
                                prob_each = a1_prob),
               a2 = complete_ra(N = j,
                                conditions = a2,
                                prob_each = a2_prob),
               a3 = complete_ra(N = j,
                                conditions = a3,
                                prob_each = a3_prob),
               a4 = complete_ra(N = j,
                                conditions = a4,
                                prob_each = a4_prob),
               a5 = complete_ra(N = j,
                                conditions = a5,
                                prob_each = a5_prob),
               vignette = paste(a1, a2, a3, a4, a5, sep = "_"))
    }
    
    assign_respondents <- declare_assignment(handler = fun_assign_resp)
    
    fun_assign_util <- function(data) {
      data %>%
        mutate(utility = 
                 qnorm(0.5 + 0.01) * (a1 == "60") +
                 qnorm(0.5 - 0.02) * (a2 == "Woman") +
                 qnorm(0.5 - 0.03) * (a3 == "Many Years") +
                 qnorm(0.5 + i) * (a4 == "Outsider1") +
                 qnorm(0.5 + i) * (a4 == "Outsider2") +
                 qnorm(0.5 + 0.35) * (a5 == "Cough Fever") +
                 qnorm(0.5 + 0.25) * (a5 == "High Fever") +
                 noise + 
                 0) %>%
        mutate(outcome = determine_outcome(utility))
    }
    
    assign_utility <- declare_step(handler = fun_assign_util)
    
    design <- declare_population(df_conj) +
      assign_respondents +
      assign_utility +
      declare_estimator(outcome ~ a1 + a2 + a3 + a4 + a5,
                        model = lm_robust,
                        se_type = "stata",
                        term = TRUE)
    
    diagnosis <- diagnose_design(design, sims = n_sims)
    
    x <- diagnosis$diagnosands_df
    x$amce <- i
    x$observations <- j
    
    results <- rbind(results, x)
  }
}

results %>%
  subset(term == "a4Outsider1") %>%
  ggplot(aes(amce, power)) +
  geom_ribbon(aes(ymin = (power - (1.96 * `se(power)`)), 
                  ymax = (power + (1.96 * `se(power)`))), 
              alpha = 0.1) +
  geom_line() +
  geom_point() +
  facet_grid(~ as.factor(observations)) +
  geom_hline(yintercept = 0.8, col = "red", linetype = "dashed") +
  labs(title = "Power Curve by AMCE and Sample Size",
       subtitle = "AMCE for 3 Level Attribute Assigned with Equal Probability") +
  scale_x_continuous(breaks = seq(-.02, 0.12, 0.01)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

